Skip to content

feat(ARSN-572): Propagate W3C trace context to MongoDB metadata writes#2611

Open
delthas wants to merge 1 commit into
development/8.3from
improvement/ARSN-572/trace-context
Open

feat(ARSN-572): Propagate W3C trace context to MongoDB metadata writes#2611
delthas wants to merge 1 commit into
development/8.3from
improvement/ARSN-572/trace-context

Conversation

@delthas
Copy link
Copy Markdown
Contributor

@delthas delthas commented Apr 13, 2026

Context

Cloudserver recently gained OpenTelemetry tracing (CLDSRV-884, cloudserver PR scality/cloudserver#6140). That gives us a single end-to-end trace per S3 request spanning cloudserver → vault → hdproxy → hyperiod.

But many state mutations in the S3 stack are written to MongoDB and consumed asynchronously by worker services via the oplog:

  • Backbeat — replication, lifecycle expiration/transition, garbage collection
  • Sorbet — cold-storage restore completion
  • Other oplog consumers

Today the oplog entry carries no trace context, so async work performed hours or days after the S3 request cannot be tied back to the request that caused it. A user's POST …?restore produces a trace that ends at the metadata write; the actual cold-tier retrieval by Sorbet is invisible from the original trace.

Solution

Stamp a W3C trace context (traceparent/tracestate) on every mutating metadata write, auto-derived from the currently-active OTEL context.

The field lives on ObjectMDData next to originOp:

traceContext?: {
    traceparent?: string;
    tracestate?: string;
};

A new helper captureCurrentTraceContext() reads the active OTEL context via @opentelemetry/api and serializes it via propagation.inject(). It's invoked at the three MongoClientInterface sites that currently set originOp:

  • internalPutObject
  • repair
  • internalDeleteObject

Consumers (backbeat, sorbet — out of scope for this PR) will later:

  1. Read value.traceContext.traceparent from the oplog entry
  2. propagation.extract() it to reconstruct a parent context
  3. Wrap per-entry processing in context.with(extractedCtx, ...)

This closes the loop on end-to-end tracing across the async boundary.

Design decisions

1. Auto-inject at the arsenal write boundary (vs. param plumbing).
The alternative was to thread a new traceContext parameter through the ~15 cloudserver call sites that already set originOp. Rejected because:

  • @opentelemetry/context-async-hooks (loaded by the OTEL SDK at cloudserver startup) already propagates the request context through arsenal's async call chain for free. Re-reading it at the mongo write is idiomatic OTEL.
  • Zero cloudserver surface change — we can iterate on arsenal independently.
  • Parallels existing behavior where arsenal reads originOp out of params and attaches it to the ObjectMD.
  • Avoids replicating the maintenance cost that makes originOp easy to forget on new API endpoints today.

2. Only @opentelemetry/api as a runtime dep (~10 KB, no SDK).
The api package is specifically designed to be safe for libraries to depend on. When no SDK is registered (today's state for anyone not opting into cloudserver OTEL), context.active() returns a no-op context, trace.getSpan() returns undefined, and captureCurrentTraceContext() returns undefined. setTraceContext(undefined) no-ops; the field stays absent in the write. Zero runtime cost when OTEL is off.

3. Optional field, no schema migration.
traceContext is absent by default — consistent with other optional ObjectMD fields like legalHold, retentionDate. No default-value change in the ObjectMDData construction. Existing consumers that don't understand the field simply ignore it; fully backward compatible.

4. No-op when no traceparent is present.
setTraceContext(undefined) and setTraceContext({ tracestate: 'x' }) (no traceparent) both no-op — the field only appears when the caller had an active trace. This prevents junk entries from any edge cases in the OTEL API.

Interaction with async flows (cold restore as example)

Cold storage restore exercises both a sync and an async metadata write:

  1. ObjectRestore:Post (sync, user-initiated via POST /bucket/key?restore) — cloudserver sets originOp; arsenal now also stamps traceContext pointing to the user's request trace. ✅
  2. ObjectRestore:Completed (async, Sorbet-triggered via cloudserver's /_/backbeat/* route) — behavior depends on Sorbet:
    • If Sorbet forwards traceparent on its HTTP call (ideally extracted from step 1's stored traceContext): arsenal picks it up via context.with(remoteContext, ...) in cloudserver's server.js, and the Completed write ties back to the original user restore. ✅
    • If Sorbet calls without traceparent (today's state): safely no-ops — Completed write has no traceContext, no regression. ✅

Same pattern applies to lifecycle expiration/transition, CRR replication, and GC.

Verification

Unit tests

tests/unit/models/ObjectMD.spec.js — 4 new tests covering:

  • Valid traceparent round-trips through set/getTraceContext
  • setTraceContext(undefined) is a no-op
  • setTraceContext({ tracestate: ... }) without traceparent is a no-op
  • getValue() serializes traceContext only when set

All 128 ObjectMD tests pass (was 124 + 4 new).

End-to-end (follow-up on a test cluster)

  1. Bump cloudserver's arsenal pin to a build of this branch; roll the cloudserver deployment on an Artesca cluster with ENABLE_OTEL=true
  2. Make a PutObject; note the trace ID in Jaeger
  3. Inspect the MongoDB oplog: o.value.traceContext.traceparent should be present, with characters 3-35 matching the Jaeger trace ID
  4. Negative test with ENABLE_OTEL=false → field should be absent

Issue: ARSN-572

@bert-e
Copy link
Copy Markdown
Contributor

bert-e commented Apr 13, 2026

Hello delthas,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Available options
name description privileged authored
/after_pull_request Wait for the given pull request id to be merged before continuing with the current one.
/bypass_author_approval Bypass the pull request author's approval
/bypass_build_status Bypass the build and test status
/bypass_commit_size Bypass the check on the size of the changeset TBA
/bypass_incompatible_branch Bypass the check on the source branch prefix
/bypass_jira_check Bypass the Jira issue check
/bypass_peer_approval Bypass the pull request peers' approval
/bypass_leader_approval Bypass the pull request leaders' approval
/approve Instruct Bert-E that the author has approved the pull request. ✍️
/create_pull_requests Allow the creation of integration pull requests.
/create_integration_branches Allow the creation of integration branches.
/no_octopus Prevent Wall-E from doing any octopus merge and use multiple consecutive merge instead
/unanimity Change review acceptance criteria from one reviewer at least to all reviewers
/wait Instruct Bert-E not to run until further notice.
Available commands
name description privileged
/help Print Bert-E's manual in the pull request.
/status Print Bert-E's current status in the pull request TBA
/clear Remove all comments from Bert-E from the history TBA
/retry Re-start a fresh build TBA
/build Re-start a fresh build TBA
/force_reset Delete integration branches & pull requests, and restart merge process from the beginning.
/reset Try to remove integration branches unless there are commits on them which do not appear on the source branch.

Status report is not available.

@bert-e
Copy link
Copy Markdown
Contributor

bert-e commented Apr 13, 2026

Request integration branches

Waiting for integration branch creation to be requested by the user.

To request integration branches, please comment on this pull request with the following command:

/create_integration_branches

Alternatively, the /approve and /create_pull_requests commands will automatically
create the integration branches.

@delthas delthas force-pushed the improvement/ARSN-572/trace-context branch from d42f30b to 01693cd Compare April 13, 2026 15:38
Comment thread lib/models/ObjectMD.ts Outdated
Comment thread lib/storage/metadata/mongoclient/MongoClientInterface.ts Outdated
@scality scality deleted a comment from bert-e Apr 13, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 13, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 73.54%. Comparing base (1acb14f) to head (8619151).

Additional details and impacted files
@@                 Coverage Diff                 @@
##           development/8.3    #2611      +/-   ##
===================================================
+ Coverage            73.48%   73.54%   +0.05%     
===================================================
  Files                  222      223       +1     
  Lines                18183    18215      +32     
  Branches              3762     3767       +5     
===================================================
+ Hits                 13362    13396      +34     
+ Misses                4816     4814       -2     
  Partials                 5        5              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@delthas delthas force-pushed the improvement/ARSN-572/trace-context branch from 01693cd to ca456c9 Compare April 13, 2026 15:56
Comment thread lib/storage/metadata/captureTraceContext.ts
@delthas delthas force-pushed the improvement/ARSN-572/trace-context branch from ca456c9 to a30bce8 Compare April 14, 2026 07:19
Comment thread lib/storage/metadata/captureTraceContext.ts
@delthas delthas force-pushed the improvement/ARSN-572/trace-context branch from a30bce8 to 39233c8 Compare April 14, 2026 07:23
Comment thread lib/storage/metadata/captureTraceContext.ts
@delthas delthas force-pushed the improvement/ARSN-572/trace-context branch from 39233c8 to a1fe7ca Compare May 5, 2026 15:58
@delthas delthas marked this pull request as ready for review May 5, 2026 16:02
@scality scality deleted a comment from bert-e May 5, 2026
@delthas delthas force-pushed the improvement/ARSN-572/trace-context branch from a1fe7ca to eb5a535 Compare May 5, 2026 16:04
@delthas delthas requested review from a team, DarkIsDude and benzekrimaha May 5, 2026 16:08
Comment thread lib/models/ObjectMD.ts
Comment on lines +93 to +94
traceparent?: string;
tracestate?: string;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
traceparent?: string;
tracestate?: string;
parentId?: string;
state?: string;

State is a string or a list of possible value ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those names are standard, come from a W3C spec: https://www.w3.org/TR/trace-context/

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@delthas indeed, I saw that we use them in several place in the code but when stored in mongo, under the traceContext structure, it don't make sense anymore to keep trace

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd heavily lean towards keeping the standard names, as they have specific formats, with meanings specified in the specification. Specifically, traceparent is not a plain parentId at all, it is a carefull formatted string, such as traceparent: 00-0af7651916cd43dd8448eb211c80319c-b9c7c989f97918e1-01, which carries a version, a trace ID, a span ID, and flags, including the standard "is sampled" flag. parentId would be outright wrong, but even if we gave it a better name, it would be less precise and more confusing than traceparent, which is exactly the format the consumer would use. For example, see my backbeat code that recreates an OTEL span from saved trace context: tracecontext and traceparent are the field names used by the library.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you prefer to keep it, then let's go. It still don't make sense for me but I'm fine to move forward

Comment thread lib/models/ObjectMD.ts Outdated
Comment thread lib/storage/metadata/mongoclient/MongoClientInterface.ts
Comment thread lib/storage/metadata/mongoclient/MongoClientInterface.ts Outdated
Comment thread lib/storage/metadata/mongoclient/MongoClientInterface.ts
@delthas delthas force-pushed the improvement/ARSN-572/trace-context branch from eb5a535 to 7c5ecce Compare May 6, 2026 14:25
Comment thread lib/storage/metadata/mongoclient/MongoClientInterface.ts
Add a traceContext field to ObjectMD carrying the currently-active
OTEL trace context (W3C traceparent/tracestate). Inject it
automatically at the three metadata write chokepoints where originOp
is set today: internalPutObject, repair, and internalDeleteObject.

The value ends up in the MongoDB oplog; downstream consumers
(backbeat, sorbet) can extract it to continue the trace across the
async boundary, closing the loop on end-to-end tracing for flows
that cross the oplog.

When OTEL is not active on the caller (no SDK, or request outside
a traced context), captureCurrentTraceContext returns undefined,
setTraceContext no-ops, and the field stays absent — zero cost.

Only adds @opentelemetry/api as a runtime dependency (the API-only
surface package, becomes a no-op when no SDK is registered).

Issue: ARSN-572
@delthas delthas force-pushed the improvement/ARSN-572/trace-context branch from 28bca79 to 8619151 Compare May 7, 2026 09:06
@claude
Copy link
Copy Markdown

claude Bot commented May 7, 2026

LGTM

Review by Claude Code

@delthas delthas requested review from SylvainSenechal and removed request for benzekrimaha May 12, 2026 08:36
@scality scality deleted a comment from claude Bot May 13, 2026
@scality scality deleted a comment from claude Bot May 13, 2026
@scality scality deleted a comment from claude Bot May 13, 2026
@scality scality deleted a comment from claude Bot May 13, 2026
@scality scality deleted a comment from claude Bot May 13, 2026
@scality scality deleted a comment from claude Bot May 13, 2026
@scality scality deleted a comment from claude Bot May 13, 2026
@scality scality deleted a comment from claude Bot May 13, 2026
Comment thread lib/models/ObjectMD.ts
* @param tc - W3C trace context carrier
* @return itself
*/
setTraceContext(tc?: TraceContextCarrier) {
Copy link
Copy Markdown
Contributor

@francoisferrand francoisferrand May 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this is correct.

  • By relying on a call to setTraceContext, some (bad) code may unexpectedly update the metadata without updating the trace context : and we would end up with incorrect trace... (This has happened before with changes being made without calling setOriginOp())
  • Especially, updating metadata (as in "manipulating the ObjectMD") may be done by backbeat, but the actual write request is made by Cloudserver : so the traceContext should be the one at the bottom (in cloudserver) - and will however "derive" from the traceContext in backbeat
  • It may thus be safer to handle this at lower level, like in MetadataWrapper (or even below, in specific backends? would allow correlating more details, like metadata or mongo inner details, but requires to duplicate the code and may break layering/abstraction...)

In addition (and maybe this becomes irrelevant with the above point, or maybe not): I know ObjectMD is mostly written with simple accessors, but this is causing issues. Would be better to allow passing the trace context at the proper point in the lifecycle of the object, and introduce the parameter at that specific point. For exemple, if trace should be passed whenever we make a new MD op, this should be a second parameter of setOriginOp()

Copy link
Copy Markdown
Contributor

@francoisferrand francoisferrand May 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, looking below it is applied in mongo client : so correct indeed, but then the problem is more related to abstraction. Setting the trace context SHOULD NOT be done by "users" of ObjectMD (cloudserver and backbeat mostly), but only by metadata backend
→ so we should not have this method.

also it will not be present for other backends -esp. metadata- : should we consider handling this in MetadataWrapper instead of the actual backend? If we stick with current approach (in mongoclientinterface) there should be a ticket to do the same in metadata (and thus probably export applyTraceContext, as it will be used there)

cb: ArsenalCallback<string | void>,
): void {
MongoUtils.serialize(objVal);
applyTraceContext(objVal, captureCurrentTraceContext());
Copy link
Copy Markdown
Contributor

@francoisferrand francoisferrand May 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we always serialize objet before writing it, should we not apply traceContext in serialize ?

* Generic over `data` so callers can use either an ObjectMD's `_data`
* or a raw `ObjectMDData` without type gymnastics.
*/
export function applyTraceContext(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function is always called with captureCurrentTraceContext() in argument: should we not merge them?

objMetadata.setOriginOp(params.originOp);
objMetadata.setTraceContext(captureCurrentTraceContext());
objMetadata.setDeleted(true);
return next(null, objMetadata.getValue());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • serialize() not called here, contrary to the other cases : not sure if this can create issue
  • the whole parsing/serialization for adding 3 fields is probably overkill, when we could simply set the 3 fields manually (obj.originOp = params.originOp ; obj.deleted = true ; ...)
  • we may even be able to use an aggregation pipeline to do this service side and possibly aomatically

not related to this PR, let's not fix it yet ; but please create a ticket to look at this

@scality scality deleted a comment from bert-e May 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants